Information Content Measures of Semantic Similarity Perform Better Without Sense-Tagged Text
Abstract
This paper presents an empirical comparison of similarity measures for pairs of concepts based on Information Content. It shows that using modest amounts of untagged text to derive Information Content results in higher correlation with human similarity judgments than using the largest available corpus of manually annotated sense-tagged text.
Related articles
Maximizing Semantic Relatedness to Perform Word Sense Disambiguation
This article presents a method of word sense disambiguation that assigns a target word the sense that is most related to the senses of its neighboring words. We explore the use of measures of similarity and relatedness that are based on finding paths in a concept network, information content derived from a large corpus, and word sense glosses. We observe that measures of relatedness are useful ...
Valuing Semantic Similarity
Similarity is a tool widely used in various domains such as DNA sequence analysis, knowledge representation, natural language processing, data mining, information retrieval, information flow etc. Computing semantic similarity between two entities is a non-trivial task. There are many ways to define semantic similarity. Some measures have been proposed combining both statistical information and ...
A Comprehensive Comparative Study of Word and Sentence Similarity Measures
Sentence similarity is considered the basis of many natural language tasks such as information retrieval, question answering and text summarization. The semantic meaning between compared text fragments is based on the words’ semantic features and their relationships. This article reviews a set of word and sentence similarity measures and compares them on benchmark datasets. On the studied datas...
Knowledge-based method for determining the meaning of ambiguous biomedical terms using information content measures of similarity.
In this paper, we introduce a novel knowledge-based word sense disambiguation method that determines the sense of an ambiguous word in biomedical text using semantic similarity or relatedness measures. These measures quantify the degree of similarity between concepts in the Unified Medical Language System (UMLS). The objective of this work was to develop a method that can disambiguate terms in ...
Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text
In this article, we evaluate a knowledge-based word sense disambiguation method that determines the intended concept associated with an ambiguous word in biomedical text using semantic similarity and relatedness measures. These measures quantify the degree of similarity or relatedness between concepts in the Unified Medical Language System (UMLS). The objective of this work is to d...
Publication date: 2010